Use containerd restart manager to monitor services #3168

kkroo · 2018-08-26T17:12:43Z

- What I did
Adds system service uptime monitor that restarts stopped system services periodically (default is 10 seconds)

- How I did it
Integrate containerd containerd/containerd#2318
Requires CONTAINERD_COMMIT v1.2.0

- How to verify it

(ns: sshd) linuxkit-08002771aa74:/var/log# ctr -n services.linuxkit c info sshd
{
    "ID": "sshd",
    "Labels": {
        "containerd.io/restart.logpath": "/var/log/sshd.restart.log",
        "containerd.io/restart.status": "running"
    },
...

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

GordonTheTurtle · 2018-08-26T17:12:44Z

Please sign your commits following these rules:
https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "restart_monitor" git@github.com:kkroo/linuxkit.git somewhere
$ cd somewhere
$ git commit --amend -s --no-edit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.

justincormack · 2018-08-28T12:31:41Z

Hi, we can't merge PRs unless they are signed-off-by as per instructions above.

I think restart policy should probably be configurable.

ijc

I'm not sure this is using the provided interfaces correctly, I can't seem to find any other examples (there seem to be none in the containerd tree, e.g. in ctr), but it certainly looks suspiciously racy to me as is.

pkg/init/cmd/service/cmd.go

ijc · 2018-08-30T15:26:07Z

containerd/containerd#2318 shows it using container.Update to set it after starting, also using WithNoRestart on teardown. However I'm not completely convinced this is intended for this use case. From containerd/containerd#2318 (comment):

The usecase is for system level containers, it will allow remote snapshotters, runtimes, and such to be kept running without any external deps on a system. Its not built for the kube or docker restart policies, just a simple internal function to keep low level system containers running, maybe something like kubelet in a container ;)

It seems like it is for containerd internal functionality not 3rd party use (as much as some linuxkit containers could be considered "system level containers").

kkroo · 2018-11-27T20:29:04Z

Addressing comments regarding race conditions. Regarding the use of this interface, it is a plugin with public interfaces and seems very well suited for improving the availability of linuxkit's system services.

medic15 · 2018-11-27T21:23:20Z

This needs to be configurable for each service. Not all those who wander are lost. Not all services that stop are broken.

kkroo · 2018-11-28T04:57:00Z

This needs to be configurable for each service. Not all those who wander are lost. Not all services that stop are broken.

Long running services are typically ones where any exit code would be anomalous. Yes, we need to continue to support one off execution. This change would obviously change that and restart each service that has exited. One off executables are still supported and probably better served under the onboot section of the runtime definition where there is reliable execution ordering for dependencies. This would provide reliable long running services to Linuxkit without the need to restart the whole system. Besides, I think it would be interesting to have a service that exits as part of its nominal behavior be monitored by this change to implement watchdogs or custom polling functionality.

ijc · 2018-11-28T10:38:57Z

@medic15 and @justincormack are correct, this needs to be configurable per service.

I still have concerns regarding this use of a plugin which is "The usecase is for system level containers" and "Its not built for the kube or docker restart policies, just a simple internal function to keep low level system containers running". Whether or not it happens to currently meet the use case here if we are using it outside its "support envelope" we have no guarantee that it will continue to be suitable in the future. I'd be happy to be overruled by the other maintainers on that though.

justincormack · 2018-11-28T10:39:22Z

At present a few of the services we run may terminate (eg rngd if no rng is supported) but this could be changed.

medic15 · 2018-11-28T14:48:59Z

My use case involves populating a Redis in-memory database using data from several disparate sources. This can't be done in the onboot phase since my init service and Redis cannot be running at the same time.

justincormack · 2018-11-28T14:51:02Z

@medic15 yes that seems a reasonable use case where it should be configurable.

Signed-off-by: Omar Ramadan <omar.ramadan93@gmail.com>

kkroo · 2018-11-28T22:01:15Z

@medic15 @justincormack I still don't understand the use case and why such a one-off can't be added to the end of the onboot with some bash script that would wait for redis to come up. Hopefully you can help clarify.

In any case, I've added a boolean flag noRestart to the runtime configuration section than can disable the restart. PTAL!

justincormack · 2018-11-28T22:04:12Z

Services are not started until all onboot containers have exited.

rn · 2019-12-19T19:49:03Z

closing this as there was no activity for over a year and some changes were requested.

GordonTheTurtle added dco/no status/0-triage labels Aug 26, 2018

ijc suggested changes Aug 30, 2018

View reviewed changes

pkg/init/cmd/service/cmd.go Outdated Show resolved Hide resolved

pkg/init/cmd/service/cmd.go Outdated Show resolved Hide resolved

pkg/init/cmd/service/cmd.go Outdated Show resolved Hide resolved

kkroo force-pushed the restart_monitor branch from d4b515d to 5eb1260 Compare November 27, 2018 20:11

GordonTheTurtle removed the dco/no label Nov 27, 2018

kkroo force-pushed the restart_monitor branch from 5eb1260 to cdf964f Compare November 27, 2018 20:47

Add containerd restart plugin over moby services

1d1c8fc

Signed-off-by: Omar Ramadan <omar.ramadan93@gmail.com>

kkroo force-pushed the restart_monitor branch from cdf964f to 1d1c8fc Compare November 28, 2018 21:56

rn closed this Dec 19, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use containerd restart manager to monitor services #3168

Use containerd restart manager to monitor services #3168

kkroo commented Aug 26, 2018

GordonTheTurtle commented Aug 26, 2018

justincormack commented Aug 28, 2018

ijc left a comment

ijc commented Aug 30, 2018

kkroo commented Nov 27, 2018

medic15 commented Nov 27, 2018

kkroo commented Nov 28, 2018 •

edited

ijc commented Nov 28, 2018

justincormack commented Nov 28, 2018

medic15 commented Nov 28, 2018

justincormack commented Nov 28, 2018

kkroo commented Nov 28, 2018

justincormack commented Nov 28, 2018

rn commented Dec 19, 2019

Use containerd restart manager to monitor services #3168

Use containerd restart manager to monitor services #3168

Conversation

kkroo commented Aug 26, 2018

GordonTheTurtle commented Aug 26, 2018

justincormack commented Aug 28, 2018

ijc left a comment

Choose a reason for hiding this comment

ijc commented Aug 30, 2018

kkroo commented Nov 27, 2018

medic15 commented Nov 27, 2018

kkroo commented Nov 28, 2018 • edited

ijc commented Nov 28, 2018

justincormack commented Nov 28, 2018

medic15 commented Nov 28, 2018

justincormack commented Nov 28, 2018

kkroo commented Nov 28, 2018

justincormack commented Nov 28, 2018

rn commented Dec 19, 2019

kkroo commented Nov 28, 2018 •

edited